{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 18.2 Basic operations on B-trees"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 18.2-1\n",
    "\n",
    "> Show the results of inserting the keys \n",
    "\n",
    "> $F, S, Q, K, C, L, H, T, V, W, M, R, N, P, A, B, X, Y, D, Z, E$\n",
    "\n",
    "> in order into an empty B-tree with minimum degree 2. Draw only the configurations of the tree just before some node must split, and also draw the final configuration."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](./img/18.2-1_1.png)\n",
    "\n",
    "![](./img/18.2-1_2.png)\n",
    "\n",
    "![](./img/18.2-1_3.png)\n",
    "\n",
    "![](./img/18.2-1_4.png)\n",
    "\n",
    "![](./img/18.2-1_5.png)\n",
    "\n",
    "![](./img/18.2-1_6.png)\n",
    "\n",
    "![](./img/18.2-1_7.png)\n",
    "\n",
    "![](./img/18.2-1_8.png)\n",
    "\n",
    "![](./img/18.2-1_9.png)\n",
    "\n",
    "![](./img/18.2-1_10.png)\n",
    "\n",
    "![](./img/18.2-1_11.png)\n",
    "\n",
    "![](./img/18.2-1_12.png)\n",
    "\n",
    "![](./img/18.2-1_13.png)\n",
    "\n",
    "![](./img/18.2-1_14.png)\n",
    "\n",
    "![](./img/18.2-1_15.png)\n",
    "\n",
    "![](./img/18.2-1_16.png)\n",
    "\n",
    "![](./img/18.2-1_17.png)\n",
    "\n",
    "![](./img/18.2-1_18.png)\n",
    "\n",
    "![](./img/18.2-1_19.png)\n",
    "\n",
    "![](./img/18.2-1_20.png)\n",
    "\n",
    "![](./img/18.2-1_21.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 18.2-2\n",
    "\n",
    "> Explain under what circumstances, if any, redundant DISK-READ or DISK-WRITE operations occur during the course of executing a call to B-TREE-INSERT. (A redundant DISK-READ is a DISK-READ for a page that is already in memory. A redundant DISK-WRITE writes to disk a page of information that is identical to what is already stored there.)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "No redundant."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 18.2-3\n",
    "\n",
    "> Explain how to find the minimum key stored in a B-tree and how to find the predecessor of a given key stored in a B-tree."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "class BTreeNode:\n",
    "    def __init__(self, t):\n",
    "        self.n = 0\n",
    "        self.key = [None] * (2 * t - 1)\n",
    "        self.c = [None] * (2 * t)\n",
    "        self.leaf = True\n",
    "\n",
    "\n",
    "class BTree:\n",
    "    def __init__(self, degree):\n",
    "        self.t = degree\n",
    "        self.root = BTreeNode(degree)\n",
    "\n",
    "    def disk_read(self, x):\n",
    "        pass\n",
    "\n",
    "    def disk_write(self, x):\n",
    "        pass\n",
    "\n",
    "    def split_child(self, x, i):\n",
    "        t = self.t\n",
    "        z = BTreeNode(t)\n",
    "        y = x.c[i]\n",
    "        z.leaf = y.leaf\n",
    "        z.n = t - 1\n",
    "        for j in range(t - 1):\n",
    "            z.key[j] = y.key[j + t]\n",
    "        if not y.leaf:\n",
    "            for j in range(t):\n",
    "                z.c[j] = y.c[j + t]\n",
    "        y.n = t - 1\n",
    "        for j in range(x.n, i - 1, -1):\n",
    "            x.c[j + 1] = x.c[j]\n",
    "        x.c[i + 1] = z\n",
    "        for j in range(x.n - 1, i - 2, -1):\n",
    "            x.key[j + 1] = x.key[j]\n",
    "        x.key[i] = y.key[t - 1]\n",
    "        x.n += 1\n",
    "        self.disk_write(y)\n",
    "        self.disk_write(z)\n",
    "        self.disk_write(x)\n",
    "\n",
    "    def insert(self, k):\n",
    "        t = self.t\n",
    "        r = self.root\n",
    "        if r.n == 2 * t - 1:\n",
    "            s = BTreeNode(t)\n",
    "            self.root = s\n",
    "            s.leaf = False\n",
    "            s.n = 0\n",
    "            s.c[0] = r\n",
    "            self.split_child(s, 0)\n",
    "            self.insert_nonfull(s, k)\n",
    "        else:\n",
    "            self.insert_nonfull(r, k)\n",
    "\n",
    "    def insert_nonfull(self, x, k):\n",
    "        t = self.t\n",
    "        i = x.n - 1\n",
    "        if x.leaf:\n",
    "            while i >= 0 and k < x.key[i]:\n",
    "                x.key[i + 1] = x.key[i]\n",
    "                i -= 1\n",
    "            x.key[i + 1] = k\n",
    "            x.n += 1\n",
    "            self.disk_write(x)\n",
    "        else:\n",
    "            while i >= 0 and k < x.key[i]:\n",
    "                i -= 1\n",
    "            i += 1\n",
    "            self.disk_read(x.c[i])\n",
    "            if x.c[i].n == 2 * t - 1:\n",
    "                self.split_child(x, i)\n",
    "                if k > x.key[i]:\n",
    "                    i += 1\n",
    "            self.insert_nonfull(x.c[i], k)\n",
    "\n",
    "    def minimum(self):\n",
    "        def minimum_sub(x):\n",
    "            if x is None:\n",
    "                return None\n",
    "            if x.n > 0 and x.c[0] is not None:\n",
    "                return minimum_sub(x.c[0])\n",
    "            return x.key[0]\n",
    "        if self.root.n == 0:\n",
    "            return None\n",
    "        return minimum_sub(self.root)\n",
    "\n",
    "    def predecessor(self, k):\n",
    "        def predecessor_sub(x):\n",
    "            if x is None:\n",
    "                return None\n",
    "            for i in xrange(x.n - 1, -1, -1):\n",
    "                if k > x.key[i]:\n",
    "                    c = predecessor_sub(x.c[i + 1])\n",
    "                    if c is None:\n",
    "                        return x.key[i]\n",
    "                    return max(c, x.key[i])\n",
    "            return predecessor_sub(x.c[0])\n",
    "        if self.root.n == 0:\n",
    "            return None\n",
    "        return predecessor_sub(self.root)\n",
    "\n",
    "    def successor(self, k):\n",
    "        def successor_sub(x):\n",
    "            if x is None:\n",
    "                return None\n",
    "            for i in xrange(x.n):\n",
    "                if k < x.key[i]:\n",
    "                    c = successor_sub(x.c[i])\n",
    "                    if c is None:\n",
    "                        return x.key[i]\n",
    "                    return min(c, x.key[i])\n",
    "            return successor_sub(x.c[x.n])\n",
    "        if self.root.n == 0:\n",
    "            return None\n",
    "        return successor_sub(self.root)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 18.2-4 $\\star$\n",
    "\n",
    "> Suppose that we insert the keys $\\{1, 2, \\dots, n\\}$ into an empty B-tree with minimum degree 2. How many nodes does the final B-tree have?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At least $n - 2\\lg(n+1)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 18.2-5\n",
    "\n",
    "> Since leaf nodes require no pointers to children, they could conceivably use a different (larger) $t$ value than internal nodes for the same disk page size. Show how to modify the procedures for creating and inserting into a B-tree to handle this variation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\dots$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 18.2-6\n",
    "\n",
    "> Suppose that we were to implement B-TREE-SEARCH to use binary search rather than linear search within each node. Show that this change makes the CPU time required $O(\\lg n)$, independently of how $t$ might be chosen as a function of $n$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\log_{t} n \\cdot \\lg t = \\lg n$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 18.2-7\n",
    "\n",
    "> Suppose that disk hardware allows us to choose the size of a disk page arbitrarily, but that the time it takes to read the disk page is $a + bt$, where $a$ and $b$ are specified constants and $t$ is the minimum degree for a B-tree using pages of the selected size. Describe how to choose $t$ so as to minimize (approximately) the B-tree search time. Suggest an optimal value of $t$ for the case in which $a = 5$ milliseconds and $b = 10$ microseconds."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\min \\log_t n \\cdot (a + bt) = \\min \\frac{a + bt}{\\ln t}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\frac{\\partial}{\\partial t} \\left ( \\frac{a + bt}{\\ln t} \\right ) = - \\frac{a + bt - bt \\ln t}{t \\ln^2 t}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\begin{array}{rll}\n",
    "a + bt &=& bt \\ln t \\\\\n",
    "5 + 10t &=& 10t \\ln t \\\\\n",
    "t &=& \\displaystyle e^{W \\left ( \\frac{1}{2e} \\right ) + 1} \\\\\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "where $W$ is the LambertW function, and we should choose $t=3$."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
